Merge-Based Prototype Selection for Nearest Neighbor Classification

Authors

  • Ramón A. Mollineda
  • Francesc J. Ferri
  • Enrique Vidal
Abstract

A generalized prototype-based classification scheme founded on hierarchical clustering is proposed. The basic idea is to obtain a condensed 1-NN classification rule by replacing a group of prototypes with a single representative while approximately preserving the classification ability of the original set. The algorithm improves and generalizes previous work by explicitly introducing the concepts of cluster and cluster consistency. Apart from the quality of the obtained sets, the proposed scheme permits a very efficient and flexible implementation by exploiting geometric cluster properties. Moreover, the algorithm benefits from the well-known results on hierarchical clustering regarding computational improvements and alternative intercluster measures. Empirical results demonstrate the merits of the proposed algorithm in terms of the size of the condensed prototype sets, the accuracy of the corresponding condensed 1-NN classification rule, and the computing time.
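The general idea of merge-based condensing can be illustrated with a minimal Python sketch. It is not the authors' algorithm: it assumes Euclidean distance, uses the distance between weighted centroids as the only intercluster measure, and takes "consistency" to mean that the 1-NN rule over the current prototypes still classifies every training point correctly. The names `merge_condense`, `nn_label`, and `is_consistent` are hypothetical, and the brute-force pair search ignores the efficiency techniques the abstract mentions.

```python
import math
from itertools import combinations


def nn_label(x, protos):
    """Return the label of the prototype nearest to x (1-NN rule)."""
    return min(protos, key=lambda p: math.dist(x, p[0]))[1]


def is_consistent(protos, X, y):
    """True if the 1-NN rule over protos classifies every training point correctly."""
    return all(nn_label(x, protos) == t for x, t in zip(X, y))


def merge_condense(X, y):
    """Greedily merge same-class prototypes while the training set stays consistent.

    Each prototype is a tuple (point, label, weight), where weight counts
    the training points absorbed into it.
    """
    protos = [(tuple(x), t, 1) for x, t in zip(X, y)]
    rejected = set()  # pairs whose merge broke consistency (assumed permanent)
    while True:
        # Candidate merges: same-class pairs that have not been rejected.
        cands = [
            (math.dist(a[0], b[0]), a, b)
            for a, b in combinations(protos, 2)
            if a[1] == b[1] and frozenset((a, b)) not in rejected
        ]
        if not cands:
            return protos
        _, a, b = min(cands, key=lambda c: c[0])  # closest candidate pair
        w = a[2] + b[2]
        # The representative is the weighted centroid of the two clusters.
        centroid = tuple((a[2] * u + b[2] * v) / w for u, v in zip(a[0], b[0]))
        trial = [p for p in protos if p is not a and p is not b]
        trial.append((centroid, a[1], w))
        if is_consistent(trial, X, y):
            protos = trial  # accept the merge
        else:
            rejected.add(frozenset((a, b)))  # keep both prototypes


if __name__ == "__main__":
    X = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
    y = [0, 0, 0, 1, 1, 1]
    for point, label, weight in merge_condense(X, y):
        print(label, weight, point)
```

On two well-separated groups such as the ones in the usage example, each class collapses to a single prototype. When a merged centroid would cause some training point to be misclassified, the merge is rejected and both prototypes are kept.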

Related resources

An Improved K-Nearest Neighbor with Crow Search Algorithm for Feature Selection in Text Documents Classification

The Internet provides easy access to a kind of library resources. However, classification of documents from a large amount of data is still an issue and demands time and energy to find certain documents. Classification of similar documents in specific classes of data can reduce the time for searching the required data, particularly text documents. This is further facilitated by using Artificial...

Discriminative Learning of the Prototype Set for Nearest Neighbor Classification

The nearest neighbor rule is one of the most widely used models for classification and selecting a compact set of prototype instances is an important problem for its applications. Many existing approaches on the prototype selection problem rely on instance-based analyses and local criteria on the class distribution, which are intractable for numerical optimization techniques. In this paper, we ...

Support Vector Based Prototype Selection Method for Nearest Neighbor Rules

Support vector machines derive the class decision hyperplanes from a few selected prototypes, the support vectors (SVs), according to the principle of structural risk minimization, so they have good generalization ability. We propose a new prototype selection method based on support vectors for nearest neighbor rules. It selects prototypes only from support vectors. During classification, ...

A Classification Method for E-mail Spam Using a Hybrid Approach for Feature Selection Optimization

Spam is an unwanted email that is harmful to communications around the world. Spam leads to a growing problem in a personal email, so it would be essential to detect it. Machine learning is very useful to solve this problem as it shows good results in order to learn all the requisite patterns for classification due to its adaptive existence. Nonetheless, in spam detection, there are a large num...

On the Selection of the Globally Optimal Prototype Subset for Nearest-Neighbor Classification

The nearest-neighbor classifier has been shown to be a powerful tool for multiclass classification. We explore both theoretical properties and empirical behavior of a variant method, in which the nearest-neighbor rule is applied to a reduced set of prototypes. This set is selected a priori by fixing its cardinality and minimizing the empirical misclassification cost. In this way we alleviate the ...

Publication date: 2000